Main goals of the session
Learn how to (1) simulate genetic drift and natural selection and (2) infer, by simulation, the effect of genetic drift and natural selection on the temporal dynamics of genetic variation.
Use external R packages to achieve these objectives.
learnPopGen
We are going to use the R library learnPopGen to
simulate drift and natural selection at a single biallelic locus and to
study two classic examples of evolution. The first is the evolution of the
peppered moth, an instance of directional color change in the moth
population as a consequence of air pollution during the Industrial
Revolution. The second is sickle cell anemia in humans, an example of
balancing selection in which carriers of the sickle cell allele are
resistant to malaria even though they have some anemia.
We will use two functions: selection(), which performs numerical
analysis of a simple biallelic selection model, and
drift.selection(), which simulates drift and natural selection at
a single biallelic locus within one or several populations.
In this practical, we will see how natural selection acts in a population of finite size. Both genetic drift and natural selection play an important role in determining how allele frequencies evolve. By adjusting the population size and the relative fitness of the genotypes, we can study the interaction of these two evolutionary forces.
Genetic drift tends to eliminate genetic variation by fixing one of the alleles to the detriment of the others. In any population of limited size, one allele will eventually increase to fixation at a rate dependent on population size.
Selection can maintain or eliminate genetic variation. Selection in favor of heterozygous genotypes creates a balanced polymorphism, but selection in favor of one of the homozygotes eliminates variation in deterministic models (without genetic drift).
When natural selection and genetic drift operate together and the heterozygote is the fittest genotype, the two forces oppose each other: selection acts to maintain variation and genetic drift acts to eliminate it. Which force predominates depends on their relative strength. Genetic drift is strong when the population is small and weak when the population is large. Selection is strong if the fitness values (w) of the genotypes differ considerably and weak if they are similar to each other.
In this practical, we are going to study (i) natural selection without genetic drift and directional selection with genetic drift, using the case of the peppered moth (Biston betularia) and industrial melanism, and (ii) balancing selection with genetic drift, using the case of sickle cell anemia and malaria resistance.
The general selection model describes how a population’s allele frequencies will change when acted upon by natural selection.
| | AA | Aa | aa | Total |
|---|---|---|---|---|
| Genotype frequency | p² | 2pq | q² | 1 |
| Fitness (viability) | wAA | wAa | waa | |
| Proportion after selection | p²wAA | 2pqwAa | q²waa | w̄ |
| Genotype frequencies after selection (normalized) | p²wAA/w̄ | 2pqwAa/w̄ | q²waa/w̄ | 1 |
The idea is that the genotypes start in Hardy-Weinberg equilibrium in the offspring. Each genotype has an associated fitness value, which is where selection enters the picture: multiply the fitness value by the genotype frequency to get the relative contribution of that genotype to the adults. We define the average fitness of the population (\(\bar{w}\)) to be the sum of the relative contributions. Divide each relative contribution by the average fitness to get the normalized frequency of each genotype in the adults.
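Written out, the table corresponds to the recursion below. The symbol \(p'\) is introduced here (it is not used elsewhere in this practical) for the frequency of A among the adults, obtained by counting all AA adults plus half of the Aa adults:

\[
\bar{w} = p^2 w_{AA} + 2pq\,w_{Aa} + q^2 w_{aa},
\qquad
p' = \frac{p^2 w_{AA} + pq\,w_{Aa}}{\bar{w}}
\]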
To formulate a general expression for genotype and allele frequency change under directional selection, we need to introduce two important parameters: the selection coefficient (s) and the degree of dominance (h). The selection coefficient is the selective disadvantage of the disfavored allele and is calculated as one minus the relative fitness of the homozygote for the disfavored allele (waa = 1-s). Because selection acts on phenotypes, we also need to account for the level of dominance in the expression of fitness. When h = 0, the relative fitnesses of AA, Aa and aa are 1, 1 and 1-s respectively, so that a is recessive and A is dominant. When h = 1, a is completely dominant over A, and the relative fitnesses of AA, Aa and aa are 1, 1-s and 1-s. When h = 1/2, there is no dominance in the expression of fitness.
| wAA | wAa | waa |
|---|---|---|
| 1 | 1-hs | 1-s |
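As a minimal sketch of how this parameterization translates into a fitness vector in the order AA, Aa, aa, the helper below builds the vector from s and h. The name `fitness_from_sh` is hypothetical (it is not part of learnPopGen), and s = 0.05 is only an illustrative value.

```r
# Hypothetical helper (not part of learnPopGen): build the fitness vector
# in the order AA, Aa, aa from a selection coefficient s and dominance h.
fitness_from_sh <- function(s, h) {
  stopifnot(s >= 0, s <= 1, h >= 0, h <= 1)
  c(AA = 1, Aa = 1 - h * s, aa = 1 - s)
}

# Illustrative value s = 0.05 under the three dominance scenarios in the text.
w_recessive <- fitness_from_sh(s = 0.05, h = 0)    # a recessive: 1, 1, 0.95
w_additive  <- fitness_from_sh(s = 0.05, h = 0.5)  # no dominance: 1, 0.975, 0.95
w_dominant  <- fitness_from_sh(s = 0.05, h = 1)    # a dominant: 1, 0.95, 0.95
```

A vector built this way has the same layout as the three-genotype fitness vector used throughout the practical.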
Load the learnPopGen library. We are going to use the
selection() function to study the expected theoretical
behaviour of the selection model. This function takes three arguments:
p0, the starting frequency of the A allele;
w, a vector of the fitnesses of the three genotypes in
the order AA, Aa, aa; and time, the number of
generations to run the analysis. Using s = 5% and
p0 = 0.005, determine whether the following predictions are
true or false:
We will illustrate this section with a classic example from population genetics: selection for industrial melanism in peppered moths, in which a dominant allele is favored. Industrial melanism is one of the best-observed examples of evolutionary change due to natural selection, and it shows that natural selection can be a powerful force. Adults of Biston betularia are white. They inhabit forests, feed at night and rest during the day on trunks covered in white lichen, against which they are easily camouflaged. During the English Industrial Revolution, the habitat of this nocturnal moth was drastically altered. Burning poor-quality coal released large amounts of pollution into the atmosphere, which the wind carried to other areas and which blackened the trunks on which the moths rested. Several mutations in this species produce forms that are darker than the common one. During this period, these darker individuals (melanic forms) became more frequent because they were better camouflaged on the now black trunks. In contrast, white individuals became less well adapted, since their natural predators could spot them more easily against the blackened trunks (Figure 1).
Figure 1. (A) Map of the United Kingdom with charts showing the frequencies of the melanic (black), slightly melanic or insular (gray) and non-melanic or typical (white) forms of the moth in 1950. The melanic form had high frequencies in industrial areas (the Midlands, around London in the southeast and around Glasgow in the northwest) and low frequencies in less polluted areas. (B) Camouflage of the typical form on white trunks before the Industrial Revolution. (C) Camouflage of the melanic form on black trunks during the Industrial Revolution.
The frequency of the melanic allele (M) increased from p~0.005 in 1848 to p~0.776 in 1900 (52 generations). The selection model is:
| wMM | wMt | wtt |
|---|---|---|
| 1 | 1 | 1-s |
Through trial and error, and using the selection() function, find a
selection coefficient (s) that explains this evolutionary event, assuming
that genetic drift is not acting. For example,
start with a fitness for tt (wtt) of 0.5, then try other values and
see what happens.
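The trial-and-error search can also be followed numerically. The sketch below iterates the deterministic selection model (infinite population, no drift) in base R. The function name `iterate_selection` is hypothetical and this is not the code of selection(); the fitness vector uses the starting guess wtt = 0.5 from the text.

```r
# Hypothetical helper: iterate the deterministic selection recursion
# (infinite population, no drift) and return the frequency of the first
# allele in every generation, starting with generation 0.
iterate_selection <- function(p0, w, generations) {
  p <- numeric(generations + 1)
  p[1] <- p0
  for (g in seq_len(generations)) {
    q <- 1 - p[g]
    # Mean fitness: sum of genotype frequency times genotype fitness
    w_bar <- p[g]^2 * w[1] + 2 * p[g] * q * w[2] + q^2 * w[3]
    # Allele frequency after selection: all of MM plus half of Mt
    p[g + 1] <- (p[g]^2 * w[1] + p[g] * q * w[2]) / w_bar
  }
  p
}

# Starting guess from the text: w_tt = 0.5, with fitnesses in the order MM, Mt, tt.
trajectory <- iterate_selection(p0 = 0.005, w = c(1, 1, 0.5), generations = 52)
final_p <- trajectory[53]  # frequency of M after 52 generations
```

Comparing `final_p` with the observed p~0.776 indicates whether the next guess for wtt should be higher or lower.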
After 1995, the frequency of the M allele has decreased to less than 20% following the enactment in England of anti-pollution legislation and the obligation to use mainly electrical or non-polluting energy sources. Could we claim that evolutionary change due to natural selection is reversible? Justify your answer.
Next, we are going to add to the previous example (adaptive evolution with strong selection in favor of the melanic form) the stochastic effect of genetic drift. For this we are going to assume that the melanic form exists initially in a single mutant (heterozygous) individual in the population (p0~0.005 in a population of 100 individuals). Fisher (1930) predicted that the survival probability of a new mutant is twice its selection coefficient (at least when s is small).
To introduce the effect of genetic drift, we will now use the
drift.selection() function. It takes five main arguments:
p0, the starting frequency of the A allele;
w, a vector of the fitnesses of the three genotypes in
the order AA, Aa, aa; ngen, the number of
generations to run (note that selection() calls this argument time);
Ne, the effective population size; and nrep, the number of replicate
simulations.
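To make the model behind such a simulation concrete, here is a minimal base R sketch of a Wright-Fisher population with viability selection. It is an illustration under stated assumptions and not the code of drift.selection(): the function name `simulate_drift_selection`, the seed, the number of generations and the value s = 0.05 are all choices made for this example.

```r
# Hypothetical helper (a sketch, not the code of drift.selection()):
# Wright-Fisher model with viability selection at one biallelic locus.
# Each generation, selection shifts the allele frequency deterministically
# and drift is added by binomial sampling of 2 * Ne gene copies.
simulate_drift_selection <- function(p0, w, Ne, generations, nrep) {
  final_p <- numeric(nrep)
  for (r in seq_len(nrep)) {
    p <- p0
    for (g in seq_len(generations)) {
      if (p == 0 || p == 1) break  # allele lost or fixed: nothing changes
      q <- 1 - p
      w_bar <- p^2 * w[1] + 2 * p * q * w[2] + q^2 * w[3]
      p_sel <- (p^2 * w[1] + p * q * w[2]) / w_bar
      p <- rbinom(1, size = 2 * Ne, prob = p_sel) / (2 * Ne)
    }
    final_p[r] <- p
  }
  final_p
}

set.seed(1)  # arbitrary seed so the illustration is reproducible
# Illustrative values: s = 0.05 and 200 generations are assumptions.
final_p <- simulate_drift_selection(p0 = 0.005, w = c(1, 1, 0.95),
                                    Ne = 100, generations = 200, nrep = 20)
outcome <- c(lost        = sum(final_p == 0),
             polymorphic = sum(final_p > 0 & final_p < 1),
             fixed       = sum(final_p == 1))
```

The three counts in `outcome` correspond to the three columns of the results table; replace the illustrative fitness vector with the one estimated for the moth.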
Simulate the proposed situation and write your results in the table. In what proportion of cases is the melanic allele (M) lost even though it is advantageous? Do your results support Fisher’s hypothesis? Make at least 20 replicates of this simulation.
Compare these results with the case of infinite population size (presented in exercise 2).
| Simulation | M is lost (p=0) | Polymorphic | M is fixed (p=1) |
|---|---|---|---|
| p0=0.005, Ne=100 | | | |
Finally, we will illustrate strong selection in favor of the heterozygote in a finite population with one of the most thoroughly analyzed cases in the history of population genetics: the locus that encodes the \(\beta\) chain of hemoglobin. In current human populations there are several alleles at this locus, of which allele A is the most common. In homozygotes, the S allele causes the disease known as sickle cell anemia, named after the sickle shape that erythrocytes adopt in affected individuals (Figure 2). These individuals suffer a severe hemolytic anemia that usually causes their death before they reach adulthood. In sub-Saharan African populations, the frequency of the S allele is surprisingly high. In 1949, J. B. S. Haldane suggested that this high frequency arose because heterozygous AS individuals do not suffer from severe sickle cell anemia and also have some resistance to malaria. Haldane based his proposal on the fact that the frequency of the S allele was higher in areas where malaria was endemic.
Allison (1956) analyzed the frequencies of the three genotypes in children and adults from the following counts:
| | AA | AS | SS |
|---|---|---|---|
| Number of children | 189 | 89 | 9 |
| Number of adults | 400 | 249 | 5 |
Based on the data in the table above, is there evidence to corroborate or reject the hypothesis that the AS genotype confers resistance to malaria? Calculate the frequencies of the three genotypes in children and adults. Do genotypic frequencies in children differ from those expected according to the Hardy-Weinberg equilibrium? And in adults? Try to explain your answer.
Hint: Use the test_HWE() function from P1.
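For readers who do not have the P1 function to hand, the test can be sketched in base R as a chi-square goodness-of-fit test. The helper `hwe_chisq` below is hypothetical and is not the course's test_HWE(); it assumes a single biallelic locus and the usual one degree of freedom.

```r
# Hypothetical helper, not the course's test_HWE(): chi-square goodness-of-fit
# test of Hardy-Weinberg proportions for one biallelic locus.
hwe_chisq <- function(n_AA, n_AS, n_SS) {
  observed <- c(AA = n_AA, AS = n_AS, SS = n_SS)
  n <- sum(observed)
  p <- (2 * n_AA + n_AS) / (2 * n)  # frequency of allele A
  q <- 1 - p
  expected <- n * c(AA = p^2, AS = 2 * p * q, SS = q^2)
  chi_sq <- sum((observed - expected)^2 / expected)
  # 1 degree of freedom: 3 classes - 1 - 1 estimated allele frequency
  p_value <- pchisq(chi_sq, df = 1, lower.tail = FALSE)
  list(p = p, expected = expected, chi_sq = chi_sq, p_value = p_value)
}

# Counts from the table above.
children <- hwe_chisq(189, 89, 9)
adults   <- hwe_chisq(400, 249, 5)
```

Each result holds the allele frequency, the expected counts, the chi-square statistic and its p-value, which can then be compared between children and adults.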
Calculate the relative survival rates (dividing the frequency of each genotype in adults by the corresponding frequency in children), the relative fitness of each genotype with respect to the heterozygote, and the selection coefficients against the AA homozygote (s1) and against the SS homozygote (s2).
| wAA | wAS | wSS |
|---|---|---|
| 1-s1 | 1 | 1-s2 |
| | AA | AS | SS |
|---|---|---|---|
| Number of children | 189 | 89 | 9 |
| Number of adults | 400 | 249 | 5 |
| Frequency in children | | | |
| Frequency in adults | | | |
| Relative survival | | | |
| Relative fitness | | | |
| Selection coefficients (s1 and s2) | | | |
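The calculation described in this exercise can be sketched in base R as follows. The helper name `relative_fitness` is hypothetical; it assumes named count vectors with the genotype labels AA, AS and SS, and it scales fitness so that the heterozygote equals 1.

```r
# Hypothetical helper: relative fitnesses from genotype counts before
# (children) and after (adults) selection, scaled so the heterozygote is 1.
relative_fitness <- function(children, adults) {
  freq_children <- children / sum(children)
  freq_adults   <- adults / sum(adults)
  survival <- freq_adults / freq_children   # relative survival rate
  fitness  <- survival / survival[["AS"]]   # heterozygote set to 1
  list(survival = survival,
       fitness = fitness,
       s1 = 1 - fitness[["AA"]],            # selection against AA
       s2 = 1 - fitness[["SS"]])            # selection against SS
}

# Counts from the table above.
res <- relative_fitness(children = c(AA = 189, AS = 89, SS = 9),
                        adults   = c(AA = 400, AS = 249, SS = 5))
```

The elements of `res` map onto the last three rows of the table.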
Use the drift.selection() function to simulate allele
frequency trajectories in the case of sickle cell anemia, using the
relative fitnesses that you calculated in exercise 6.
Start the simulations with an initial S allele frequency of 0.1 (initial
A allele frequency = 0.9) and run them for 100 generations with (i) a large
population (N = 500), in which drift does not have a significant effect,
(ii) N = 100, and (iii) N = 10. Make at least 20 replicates of each
simulation and compare the results for the three population sizes. What is
the effect of genetic drift on the dynamics of allele frequencies at a locus
with strong selection in favor of the heterozygote? Justify your answer.
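As a deterministic reference point for the large-population case, the standard result for heterozygote advantage gives the equilibrium frequency of the A allele. It is written here as \(\hat{p}\), with \(\Delta p\) for the change in one generation; both symbols are introduced for this note. Under the fitnesses \(w_{AA} = 1 - s_1\), \(w_{AS} = 1\), \(w_{SS} = 1 - s_2\):

\[
\Delta p = \frac{pq\,\bigl[\,p\,(w_{AA} - w_{AS}) + q\,(w_{AS} - w_{SS})\,\bigr]}{\bar{w}}
         = \frac{pq\,(q\,s_2 - p\,s_1)}{\bar{w}},
\qquad
\Delta p = 0 \;\Rightarrow\; \hat{p} = \frac{s_2}{s_1 + s_2}
\]

Simulated trajectories in a large population are expected to settle near \(\hat{p}\), while smaller populations wander further from it.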
| Simulation | A is lost (p=0) | Polymorphic | A is fixed (p=1) |
|---|---|---|---|
| N=500 | | | |
| N=100 | | | |
| N=10 | | | |
Try to summarize what you have learnt in this practical.
Submit this document with your answers in Aul@-ESCI.
Deadline: 6 May 2024
Questions? Contact marta.coronado@uab.cat and olga.dolgova@crg.es.